DOCUMENT RESUME

ED 048 333                                              TM 000 393

AUTHOR          Stufflebeam, Daniel L.
TITLE           Evaluation as Enlightenment for Decision-Making.
INSTITUTION     Ohio State Univ., Columbus. Evaluation Center.
PUB DATE        19 Jan 68
NOTE            52p.

EDRS PRICE      EDRS Price MF-$0.65 HC-$3.2*
DESCRIPTORS     *Decision Making, Educational Accountability,
                Educational Change, *Evaluation, Evaluation
                Criteria, *Evaluation Needs, *Evaluation Techniques,
                Guides, Models, Program Evaluation, *Research
                Design, Research Problems
IDENTIFIERS     *Elementary Secondary Education Act Title I, Title
                III


ABSTRACT 



The need for competent formal evaluation programs,
particularly for new federally assisted programs, is expressed.
Problems in defining educational evaluation and its requirements, in
designing such evaluations, and 3 possible sources of faulty
conceptual bases for evaluations, are presented. An attempt is made
to define evaluation in general, to analyze emergent problems of
educational change, and to identify the types of decisions for which
evaluations are needed in these programs. Four strategies for
evaluating educational programs are outlined. These include context,
input, process, and product evaluation, each of which is used at a
distinct stage in the development of a program. Finally, a general
guide for developing evaluation designs to implement a given
evaluation program is provided. The logical structure of evaluation
design is presented in a step by step format. (PR)






U.S. DEPARTMENT OF HEALTH, EDUCATION
& WELFARE
OFFICE OF EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM THE PERSON OR
ORGANIZATION ORIGINATING IT. POINTS OF
VIEW OR OPINIONS STATED DO NOT NECESSARILY
REPRESENT OFFICIAL OFFICE OF EDUCATION
POSITION OR POLICY.






EVALUATION AS ENLIGHTENMENT FOR DECISION-MAKING 



Daniel L. Stufflebeam



An Address Delivered at the 
Working Conference on Assessment Theory 
Sponsored by 

The Commission on Assessment of Educational Outcomes 
The Association for Supervision and Curriculum Development 
Sarasota, Florida
January 19, 1968





THE OHIO STATE UNIVERSITY 
College of Education 






The EVALUATION CENTER, an agency of the College of 
Education, is committed to advancing the science and 
practice of educational evaluation. More specifically, the 
purpose of the Center is to increase education’s capability 
to obtain and use information for planning, programming, 
implementing and evolving educational activities. To serve 
this purpose, the Center’s interdisciplinary team engages 
in research, development, instruction, leadership and
service activities. 

HISTORY 

The origin of the present Center traces back to
the establishment of the Ohio State University Test Development
Center in 1962. Due to the urgent need for a more
comprehensive approach to 'evaluation' than that afforded
by standardized testing, the Test Development Center was
expanded in 1965 into the present Evaluation Center which
is concerned with many modes of evaluation in addition to
standardized testing. However, test development remains
an important part of the Evaluation Center program.

GOALS 

The broad objectives of the currently constituted 
Center are: 

to increase scientific knowledge of educational evaluation
and planning;

to develop evaluation strategies and designs;

to develop evaluation methods and materials;

to provide instruction in evaluation;

to disseminate information related to educational evaluation;

to assist educationists in evaluating their programs.

ORGANIZATION

To serve its complex objectives, the Center has
developed an interdisciplinary team. Currently, the staff
of the Center consists of fifty-four members, including
five professors’ positions, plus a varying number of visiting
faculty. The staff and visiting professors bring expertise
from the fields of economics, education (administration,
curriculum and supervision, elementary and secondary
school teaching, evaluation, mathematics, planning, research
methodology, and tests and measurement), psychology,
sociology, systems analysis, and urban planning.
The Center is organized into four divisions: Administration
and Program Development; Leadership in Evaluation; Research
in Evaluation; and Test Development. The Center
is administered by a director and an associate director
for each division.

INTRODUCTION

Chairman Beatty, and ladies and gentlemen: It is a pleasure to be
here; and I appreciate the opportunity to test some of my ideas about
educational evaluation before this distinguished group.

For the past two and one-half years I have been heavily engaged in
evaluation activities with personnel from local schools, state education
departments, and the United States Office of Education. Those activities,
for the most part, have involved efforts to evaluate projects funded under
Title I and Title III of the Elementary and Secondary Education Act of
1965. This paper is based on those experiences and is an attempt to summarize
some of my ideas about the kinds of evaluation which are needed in
current programs of educational change.

The paper is divided into two parts. Part I is concerned mainly with
determining the present state of the art in educational evaluation. In
this part, I have attempted to describe current requirements for educational
evaluation, to illustrate that educators have thus far been ineffectual
in their attempts to meet these requirements, and to point out some possible
reasons for poor evaluations in education. In Part II of the paper, I
have attempted to conceptualize some alternative approaches to educational
evaluation. This second part of the paper includes attempts to define
evaluation in general terms, to sketch four evaluation strategies which I
think have particular relevance to educational change activities, and to
explicate the structure of evaluation design.

Before proceeding to present the body of the paper, I want to emphasize
that my formulations are largely untested and are therefore highly
tentative. I sincerely hope that you find these rough ideas worthy of
your examination. If you find any of them to be viable, I hope that you
will help me, both during and after this working conference, to refine
and extend them. Without further introduction, let me proceed with the
presentation of Part I.

Part I: The State of the Art in Educational Evaluation

The Setting

Education is becoming increasingly valued as a means to meet
the social and economic as well as the intellectual needs of
society. To fulfill this expanding role, educators are being
asked to deal with many critical societal problems. Those include
inequality of opportunities among racial groups, de facto
segregation, riots in our cities, disillusionment of youth, and
school dropouts. Clearly, the rising trend of these problems must
be curbed and pushed back for the welfare of our civilization.
Education is thus being given a most urgent and difficult charge,
and to meet this charge educators must mount many new and innovative
efforts.

To help educators meet their new responsibilities, society is
annually providing billions of dollars through federal, state and
foundation programs to education agencies at all levels. Examples
of increased support to education include the Elementary and Secondary
Education Act of 1965, the Headstart Program, the Education
Professions Act, and the Experienced Teacher Fellowship Program.
Many industries are also developing education components, and soon
we will probably see many new education-industry combines and consortia.
Clearly, in addition to new responsibilities education also
has unprecedented opportunities to improve and expand its programs.

These opportunities, however, have also carried requirements
that educators evaluate their new plans and programs. These requirements
are especially evident in new federal assistance programs, e.g.,
Title I and Title III of the Elementary and Secondary Education Act.
Here, the law explicitly states that fund recipients will make at
least annual evaluation reports. As a consequence, many educators
at all levels for the first time are having to cope with requirements
for formal evaluation.

Such requirements for evaluation seem reasonable; and, in my 
judgment, they are long overdue. Funding agencies and the public 
have the right to know whether their huge expenditures for educa- 
tion are producing the desired effects. Even more important than 
this, educators themselves need evaluative information to provide 
rational bases for their decisions among alternative plans and 
procedures. However, to justify requirements for evaluation is not 
to operationalize them. Educators must respond to the requirements,
and they must do so effectively. 

The Need for Better Educational Evaluations 

Without question, educators are responding to requirements for
evaluation. The multitude of evaluation reports now available from
local schools, state education departments, regional educational
laboratories, etc., demonstrates that educators are expending significant
amounts of time, effort, and money to evaluate their programs.
However, the increased activity alone has not met the need for
effective evaluations. While educators have been busy doing evaluations,
the fruits of their efforts have not provided the information needed to
support decision-making related to the programs being evaluated.

Many of the completed evaluation reports contain only impressionistic
information. Though such information may be pertinent to the
concerns of decision-makers, it usually lacks the level of credibility
required by decision-makers to defend their decisions, and seldom can
such information be of material use in making important decisions. A
case in point is the first annual report for Title I of the Elementary
and Secondary Education Act.¹ This report was highly important since
it encompassed the thousands of Title I projects throughout the nation.
However, it fell far short of being a useful document, for it was almost
devoid of hard data. On the other hand, it did contain many anecdotal
accounts wherein persons who were responsible for conducting
Title I activities stated that they felt that their program was being
successful; and many of them speculated as to the reasons for the alleged
successes. Though these anecdotes may have touched key issues
related to the improvement of the billion dollar per year Title I
program, decision-makers in the Congress, the Office of Education,
state education departments, and local school districts could hardly
base important decisions on a few "possibly accurate" pieces of testimony.



¹Public Law 89-10: The Elementary and Secondary Education Act
of 1965, Title I.

The situation is not much different in Title III of the Elementary
and Secondary Education Act. Title III staff members in the U.S.
Office of Education have continuously ranked the quality of Title
III projects on a five point scale for each of fifteen criteria.²
The criterion relating to evaluation has consistently been ranked
near the "poor" end of the scale and lower than thirteen of the other
criteria--the exception being the criterion related to dissemination.
Guba has also suggested that evaluation plans in Title III projects
are weak.³ Based on his analysis of thirty-two Title III projects,
Guba concluded that "It is very dubious whether the results of these
evaluations will be of much use to anyone. They are likely to fit
well, however, into the conventional schoolman's stereotype of what
evaluation is: something required from on high that takes time and
pain to produce but which has very little significance for action."⁴

Unlike the Title I and Title III evaluations referenced above,
some evaluations provide for hard data. For example, the evaluation
report for New York City's Higher Horizons Program⁵ used rigorous



²These criteria are listed on pp. 70-71 of the current Title III
guidelines. (A Manual for Project Applicants and Grantees, Washington,
D.C.: U.S. Office of Education, 1967.)

³Egon G. Guba. Evaluation and the Process of Change, Notes and
Working Papers Concerning the Administration of Programs Authorized
under Title III of Public Law 89-10, The Elementary and Secondary
Education Act of 1965 as amended by Public Law 89-750, April 1967,
p. 312.

⁴Ibid.

⁵Wayne J. Wrightstone, et al. Evaluation of the Higher Horizons
Program for Underprivileged Children, Cooperative Research Project
No. 1124, Bureau of Educational Research, Board of Education of the
City of New York.

research procedures to compare the performance of an experimental group
receiving the Higher Horizons Program with the performance of a control
group which was matched to the experimental group on several counts. The
basic conclusions contained in this nearly 300 page report were typical
of findings for rigorous educational evaluations: "there were no significant
differences." In sharp contrast, however, the report also noted
that the teachers and principals who had been involved in the program
said that it was making differences so significant that the program simply
could not be abandoned.

Though the Title I, Title III and Higher Horizons evaluations differed
as to rigor, they were alike in one respect. None of them provided much
help to the decision-maker for improving the programs being evaluated.

While I have cited only three examples of the deficiencies in current
evaluations, I think they are sufficiently weighty ones to illustrate my
point. In too many cases, evaluation reports provide little or no help
to decision-makers, and decision-making in and about education must remain
an arty endeavor.

Problems in Educational Evaluation 

What is the explanation for this situation? Why is it that educators
are failing to provide evaluations which are at the same time useful and
scientifically respectable? Why is it that evaluations which adhere to
classical research methods provide information which is of only limited
help in making decisions about programs, and why do the typical "no significant
difference" findings in so many of these evaluations contravene the
experiences of those who are intimately involved in the programs?

One cannot answer these questions simply on the grounds that evaluation
practice lags too far behind evaluation theory, or that there is a
lack of effort on the part of educators to evaluate their programs.
Further, it is not enough to note that evaluation testimony given by witnesses
is not credible, or that typical findings of no significant differences
are correct because nothing in education ever makes a difference.
Rather, I think the lack of adequate evaluation information persists because
of several fundamental problems which must be solved before educators
can improve their evaluations. These include a lack of trained
evaluators, a lack of appropriate evaluation instruments and procedures,
and a lack of adequate evaluation theory. In my judgment, the most basic
of these problems is a lack of adequate theory or conceptualizations pertaining
to the nature of evaluations which are needed to accommodate
educational programs.

Clearly, the conceptual bases for evaluations are of fundamental importance.
If these conceptions are faulty, then the evaluations which are
based on them must also be faulty. Thus, it would seem highly important to
identify and examine the efficacy of conceptualizations which underlie
current needs for evaluation as well as educators' attempts to meet these
needs. It will be useful to divide these conceptualizations into three
classes and to consider each one separately. The three classes are:

1. Conceptions of the nature of the educational programs for which
evaluations are needed;

2. Conceptions of the nature of evaluation, in general, and as
related to specific classes of educational programs; and

3. Conceptions of the structure of evaluation designs needed to
conduct educational evaluations.






Problems in Defining Requirements for Educational Evaluations

First, let us examine problems involved in providing an adequate
focus for educational evaluation studies. Obviously, to evaluate, one
must know what is to be evaluated. Gaining knowledge of what is to be
evaluated, however, is currently a difficult task at best. Current needs
for educational evaluation have arisen due to programs and activities
which are new to the field of education. Such activities involve responsibilities
newly assigned to educators, new kinds of relationships among
different kinds and levels of agencies, and a need for cooperative decision-making
about education among a variety of education and non-education agencies.
It should come as no shock if the evaluation theory which has traditionally
been viewed as appropriate for education is found no longer to be
adequate to meet the information requirements in new educational programs.
Clearly, many of the new programs in education are dramatically different
from those of the past; and our evaluations should probably be geared to
answer questions which are much different from those they have answered in
the past.

What we need, I think, are conceptualizations to account for decision
processes and information requirements in new educational programs. Programs
to improve education depend heavily upon a variety of decisions, and
a variety of information is needed to make and support those decisions.
Evaluators charged with providing this information must have adequate knowledge
about the relevant decision processes and associated information requirements
before they can design adequate evaluations. They must have
knowledge about the locus, focus, timing, and criticality of decisions to
be served. At present no adequate knowledge of decision processes and
associated information requirements relative to educational programs
exists. Nor is there any ongoing program to provide this knowledge. In
short, there are no adequate conceptualizations of decisions and associated
information requirements or programs to produce them.

Problems in Defining Educational Evaluation

Next, let us attend to problems pertaining to the meaning of educational
evaluation. Usually educators have defined evaluation as the
science of determining the extent to which objectives have been achieved.
The first step in operationalizing this definition is to state program
objectives in behavioral terms. Then one must define and operationalize
criteria for use in relating outcomes to the objectives. Operationalizing
such criteria includes the specification of instruments for measuring outcomes
and standards for use in assigning values to the measured outcomes.
Standards may be either in absolute or relative terms. An absolute standard
might be that students on the average should achieve at least some
specified score on a selected achievement test. A relative standard might
be that the group of students receiving a new program should achieve scores
on a selected achievement test which on the average are higher than scores
achieved by an equivalent group of students which received some alternative
program. Regardless of the type of evaluative standard used, the data from
such studies are analyzed after a complete cycle of the program to determine
the extent to which the objectives were achieved.
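
The distinction between the two kinds of standard can be made concrete with a minimal sketch. The function names, the cutoff score, and the test scores below are all hypothetical and are invented only for illustration; nothing here is taken from an actual evaluation.

```python
from statistics import mean


def meets_absolute_standard(scores, cutoff):
    """Absolute standard: the group average must reach a fixed, specified score."""
    return mean(scores) >= cutoff


def meets_relative_standard(new_program_scores, alternative_scores):
    """Relative standard: the new-program group must, on the average, score
    higher than an equivalent group that received an alternative program."""
    return mean(new_program_scores) > mean(alternative_scores)


# Hypothetical end-of-cycle achievement test scores (invented for illustration).
new_program = [72, 68, 75, 80, 65]
alternative = [70, 66, 71, 74, 64]

print(meets_absolute_standard(new_program, cutoff=70))
print(meets_relative_standard(new_program, alternative))
```

Either check can only be run after the scores exist, that is, after a complete program cycle, which is the limitation the text goes on to discuss.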

Evaluations based upon the above definition of evaluation yield data
about gross total program effects and then only in retrospect. Such data
are useful for making judgments about a project after it has run full
cycle, but they certainly are not adequate to assist educators in the initial
planning and in the actual carrying through of programs. At best,
therefore, such evaluations provide an insufficient solution to the evaluation
problems of educators who must plan and execute innovative programs.

The inadequacy of extant conceptions of evaluation is illustrated by
the following excerpt from testimony pertaining to Title I evaluations
given before a Congressional committee by a citizens' group in New York City:

We ask for amendments to render the required evaluations of Title
I projects meaningful. The Act states that evaluations must be
made, not that they be utilized in future planning. In New York
City this year, projects were recycled before last year's evaluations
were submitted. To be made more useful, evaluations should
have built into them alternatives and the recommendations of the
evaluator. What is now an expensive exercise should be made a
function to provide service to local school boards having the responsibility
for making policy based on experience. American business
would not survive if its consultants did not supply management
with alternatives after reviewing the efficacy of programs.⁶

Here, the major concern seems to be that reports yielded by current evaluation
programs are neither sufficiently specific nor timely to influence
educational programs. Obviously, evaluations which do not at least meet
these two criteria are of little use.



Problems in Designing Educational Evaluations

Finally, let us consider problems relating to the methodology of
evaluation. If current conceptions of evaluation are not adequate for
evaluating current educational activities, neither can extant designs be
adequate. For existing means for evaluation have been developed to serve



⁶Citizens' Committee for Children of New York, Inc. Newsletter,
Statement of Mrs. Nathan W. Levin, Chairman of the Educational Services
Section before the Sub-Committee on the Elementary and Secondary Education
Act of the Education and Labor Committee of the House of Representatives,
March 18, 1967.

the ends of evaluation as they have been conceived traditionally.

The inadequacy of extant evaluation methodology is revealed when one
examines the designs educators use to evaluate their programs. If they
use a design at all, it typically is an experimental design. The fundamental
concern of experimental design is that data which are produced be
internally valid, i.e., unequivocal. Several conditions are necessary to
meet this criterion. The units to be measured should be randomly assigned
to treatment and control conditions. For example, a set of students might
be partitioned randomly into two groups--one to receive a new program, the
other to receive the school's present offering in the area to be served by
the new program. Next, the treatment and control conditions must be applied
and held constant throughout the period of the experiment, i.e., they
must conform to the initial definitions of these conditions. The new or
traditional program conditions could not be modified in process, since in
that event one could not tell what was being evaluated. Also, all students
in the experiment must receive the same amount of the treatment to which
they are assigned; and care must be taken so that students receiving one
treatment are not contaminated by the other treatment. If contamination
occurred, one could not tell what had caused what after the project was
completed. Therefore, until an experiment is completed, one must resist
the temptation to apply the successful activities of one condition to students
receiving a different condition, even if the activities in the latter
condition are obviously failing. Finally, an instrument which is valid
and reliable for the specified criterion variable must be administered
after a certain period of time--usually a complete program cycle--to subjects
from both parts of the experiment. Then, if all of the above conditions
were met, one could use predetermined statistical procedures and decision
rules to determine unequivocally that there were--or were not--significant
differences between the experimental and control groups on the
outcome variable of interest.
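
The sequence just described (random assignment, a single end-of-cycle measurement, and a decision rule fixed in advance) can be sketched in a few lines. This is only an illustration under stated assumptions: the student identifiers and scores are invented, the 0.05 threshold is an assumed decision rule, and the permutation test is one possible choice of "predetermined statistical procedure" rather than a procedure named in this paper.

```python
import random
from statistics import mean


def randomly_assign(units, rng):
    """Randomly partition units into treatment and control groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(treatment_scores, control_scores, rng, n_permutations=5000):
    """Two-sided permutation test on the difference between group means."""
    observed = abs(mean(treatment_scores) - mean(control_scores))
    pooled = list(treatment_scores) + list(control_scores)
    n_treat = len(treatment_scores)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treat]) - mean(pooled[n_treat:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


rng = random.Random(1968)  # fixed seed so the sketch is repeatable
students = [f"student_{i}" for i in range(20)]  # hypothetical units
treatment, control = randomly_assign(students, rng)

# Hypothetical end-of-cycle scores; a real study would use the criterion instrument.
scores = {s: rng.gauss(70, 10) for s in students}
p = permutation_p_value(
    [scores[s] for s in treatment], [scores[s] for s in control], rng
)

ALPHA = 0.05  # assumed decision rule, fixed before the data are examined
print("significant difference" if p < ALPHA else "no significant difference")
```

Note that nothing in the sketch can run until the end-of-cycle scores are in hand, and that any mid-course change to either condition would undermine the comparison. These are the constraints taken up in the paragraphs that follow.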

On the surface, the application of experimental design to evaluation 
problems seems reasonable, since traditionally both experimental research 
and evaluation have been used to test hypotheses about the effects of treat- 
ments. However, there are four distinct problems with this reasoning. 

First, the application of experimental design to evaluation problems
conflicts with the principle that evaluation should facilitate the continual
improvement of a program. Experimental design prevents rather than
promotes changes in the treatment because treatments cannot be altered in
process if the data about differences between treatments are to be unequivocal.
Thus, the treatment must accommodate the evaluation design rather
than vice versa; and the experimental design type of evaluation prevents
rather than promotes changes in the treatment. It is probably unrealistic
to expect directors of innovative projects to accept conditions necessary
for applying experimental design. Obviously, they can't constrain their
treatment to its original definition just to ensure internally valid end-of-year
evaluative data. Rather, project directors must use whatever evidence
they can obtain to continually refine and sometimes radically change
both the design and its implementation. It is thus contended here that
conceptions of evaluation are needed which would result in evaluation programs
which would stimulate rather than stifle dynamic development of programs.

A second flaw in the experimental design type of evaluation is that it
is useful for making decisions after a project has run full cycle but almost
useless as a device for making decisions during the planning and implementation
of a project. It provides data after the fact about the relative
effectiveness of two or more treatments. Such data, however, are
neither sufficiently specific and comprehensive nor are they provided at
appropriate times to assist the decision-maker to determine what a project
should accomplish, how it should be designed, or whether the project activities
should be modified in process. At best, experimental design evaluation
reflects post hoc on whether a project did whatever it was supposed to
do. At that time, however, it is too late to make decisions about plans
and procedures which have already largely determined the success or failure
of the project.

Guba⁷ has pointed out a third problem with the experimental design
type of evaluation: it is well suited to the antiseptic conditions of the
laboratory but not the septic conditions of the classroom. The potential
confounding variables must either be controlled or eliminated through randomization
if the study results are to have internal validity. However, in
the typical educational setting this is nearly impossible to achieve. For
example, consider the following quotation from an evaluation report completed
by Julian Stanley:

⁷Egon G. Guba. "Methodological Strategies for Educational Change,"
Paper presented to the Conference on Strategies for Educational Change,
Washington, D.C., November 8-10, 1965.

Even if the program does have considerable cumulative influence
on a person's career, this may be slow in appearing and
so interactive with other influences that it cannot be discerned
clearly by the person himself or by others.

Nevertheless, we must use whatever evidence that can be
adduced to determine whether or not such programs are worth
repeating and, if so, how they should be modified in order to
be more effective. Ideally, in the experimental design sense,
we should conduct the program as a controlled experiment, with
a well-matched control group that does not attend the Institute,
and follow up both groups for quite a few years in order
to determine how they diverge. If recruiting begins early
enough and the applicant group is able enough to provide both
groups at a sufficiently high level, this might be done, though
the "reactivity" of the disheartened rejectees, the self-fulfilling
prophecy of the rejectees, and the inability to control
the summer activities of the rejectees might undesirably
affect the outcome of the experiment. Merely having on one's
record the fact of attending a certain prestigious program,
like displaying one's Phi Beta Kappa key, might be a powerful
aid.... Our chief way of evaluating the success of the
program is via reports from staff and participants, particularly
the latter.⁸

In the above quotation, Professor Stanley has pointed to many of the reasons
why experimental design does not seem well suited to evaluation problems
in education. In many innovative programs there clearly are a multitude
of confounding factors which simply cannot effectively be controlled.

The existence of potentially confounding factors such as those named
by Stanley gives rise to a fourth kind of problem inherent in the experimental
design type of evaluation. While internal validity may be gained
through the control of extraneous variables, such an achievement is accomplished
at the expense of external validity. If the extraneous variables
are tightly controlled, one can have much confidence in the findings pertaining
to how an innovation operates in a controlled environment. However,




⁸Julian Stanley. Benefits of Research Design: A Pilot Study, Final
Report, Project No. X-005, Grant OE5-10-272, U.S. Department of Health,
Education and Welfare, Office of Education, Bureau of Research, August
1966.

such findings may not at all be generalizable to the real world where the
so-called extraneous variables operate freely. Clearly, it is important to
know how educational innovations operate under real world conditions.

Thus far in this paper, I have attempted to depict the state of the
art in educational evaluation. To begin with, I pointed out that educators
are being faced with many new and different requirements for evaluation.
Then I attempted to establish that educators' attempts to meet these requirements
thus far have been ineffectual. Finally, I suggested that there
are three types of conceptual problems which prevent educators from providing
effective evaluations. These are:

1. a lack of understanding of decision processes and information
requirements in current programs of educational change;

2. the lack of a definition of educational evaluation which is
pertinent to emergent requirements for educational evaluation;
and

3. a lack of appropriate evaluation designs.

In the remainder of this paper I shall attempt a response to these problems
by suggesting some alternative conceptions regarding the nature of
educational evaluation.
Part II. The Nature of Evaluation

In Part I, I attempted to define some of the current needs and
problems in educational evaluation. Since this is a working conference,
I should probably stop here so that you could examine my
statement of the problem and modify or replace it. After we had
achieved agreement as to what the real problems are, we could then
proceed to develop relevant solutions. However, I have been asked
by the organizers of this conference to expose some of my ideas
regarding solutions for current evaluation problems as I see them.

As I stated in Part I, I think the basic problem in educational
evaluation is a lack of adequate conceptualizations regarding a rationale
for and the meaning of evaluation in the context of emergent programs
of educational change. Thus, in the remainder of this paper, I shall
propose some alternative conceptions regarding the nature of
educational evaluation. I am acutely aware, however, of the tentative and
untested nature of my formulations. I present these ideas to you
in a heuristic spirit in the hope that you will help me examine and
refine them.

This part of the paper is divided into four major sections. The
first section is an attempt to define evaluation in general. Then,
in Section 2, an attempt is made to analyze emergent programs of
educational change and to identify the types of decisions for which
evaluations are needed in these programs. Section 3 contains outlines
of four strategies for evaluating educational programs, and the
paper is concluded in Section 4 with an attempt to outline the structure
of evaluation design. To begin, I want to suggest a general rationale
for the use of evaluation.

The General Nature of Evaluation 

A Rationale

If decision-makers are to make maximum, legitimate use of their
opportunities, they must make sound decisions regarding the alternatives
available to them. To do this, they must know what alternatives are
available and be capable of making sound judgments about the relative
merits of the alternatives. This requires relevant information.
Decision-makers should, therefore, maintain access to effective means
for providing this information. Otherwise, their decisions are likely
to be functions of many undesirable elements. Under the best of
circumstances, judgmental processes are subject to human bias, prejudice and
vested interests. Also, there is frequently a tendency to over-depend
upon personal experiences, hearsay evidence, and authoritative opinion;
and, surely, all too many decisions are due to ignorance that viable
alternatives exist. Clearly, the quality of programs depends upon the
quality of decisions in and about the programs; the quality of decisions
depends upon decision-makers' abilities to identify the alternatives which
comprise decision situations and to make sound judgments of those
alternatives; making sound judgments requires timely access to valid and
reliable information pertaining to the alternatives; and the availability
of such information requires systematic means to provide it. The
processes necessary for providing this information for decision-making
collectively comprise the concept of evaluation. Given this rationale,
I will now attempt to define what I mean by evaluation.

Evaluation Defined

Generally, evaluation means the provision of information through
formal means, such as criteria, measurement, and statistics, to supply
rational bases for making judgments which are inherent in decision
situations. To clarify this definition, it will be useful to define
several key terms. A decision is a choice among alternatives. A
decision situation is a set of alternatives. Judgment is the
assignment of values to alternatives. A criterion is a rule by which values
are assigned to alternatives, and optimally such a rule includes the
specification of variables for measurement and standards for use in
judging that which is measured. Statistics is the science of analyzing
and interpreting sets of measurements. And measurement is the
assignment of numerals to entities according to rules, and such rules usually
include the specification of sample elements, measuring devices, and
conditions for administering and scoring the measuring devices. Stated
simply, evaluation is the science of providing information for
decision-making.
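
As an illustration only, the key terms defined above can be sketched in a few lines of Python. The names used here (Alternative, make_criterion, judge, decide) and the reading-gain figures are hypothetical and do not come from the original address. The criterion shown, a measured value minus a standard, is just one simple rule among many that could be chosen.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alternative:
    """One member of a decision situation (hypothetical structure)."""
    name: str
    measurements: Dict[str, float]  # variable name -> measured value


# A criterion is a rule by which a value is assigned to an alternative.
Criterion = Callable[[Alternative], float]


def make_criterion(variable: str, standard: float) -> Criterion:
    """Build a rule from a variable to measure and a standard to judge it by."""
    def rule(alt: Alternative) -> float:
        return alt.measurements[variable] - standard
    return rule


def judge(situation: List[Alternative], criterion: Criterion) -> Dict[str, float]:
    """Judgment: assign a value to every alternative in the decision situation."""
    return {alt.name: criterion(alt) for alt in situation}


def decide(situation: List[Alternative], criterion: Criterion) -> Alternative:
    """Decision: choose the alternative with the highest assigned value."""
    return max(situation, key=criterion)


# A decision situation is a set of alternatives (illustrative data only).
situation = [
    Alternative("program_a", {"reading_gain": 4.0}),
    Alternative("program_b", {"reading_gain": 7.5}),
]
criterion = make_criterion("reading_gain", standard=5.0)
values = judge(situation, criterion)
choice = decide(situation, criterion)
```

In this sketch, evaluation supplies the values; the choice itself remains with the decision-maker, who could well apply a different rule.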

The methodology of evaluation includes four functions: collection,
organization, analysis, and reporting of information. Criteria for
assessing the adequacy of evaluations include validity (is the
information what the decision-maker needs?), reliability (is the information
reproducible?), timeliness (is the information available when the
decision-maker needs it?), pervasiveness (does the information
reach all decision-makers who need it?), and credibility (is the
information trusted by the decision-maker and those he must serve?).

Evaluation in Fields Other Than Education

The concept of evaluation as defined above is general, since
the assigning of values to alternatives is common to all forms of
human thought and activity, and since men have always sought to
establish rational defensible bases for their judgments. However,
there are many kinds of evaluation which meet the conditions of the
above definition, but which nevertheless may be distinguished one
from the other. For example, market research, cost benefit analysis,
experimental design, objective testing, operational analysis,
operations analysis, operations research, Program Evaluation and
Review Technique, Program Planning and Budgeting System, quality
control, and systems analysis all fit the general definition of
evaluation given above. Each of these modes of inquiry is the
application of systematic means to aid in the assignment of values
to the alternatives in decision situations. These different kinds
of evaluation may be differentiated by the decision situations they
serve, the settings within which the decisions are made, the kinds
of tools and techniques used, the level of precision in the
information collection and analytical modes, and the methodological skills
of those who conduct the evaluations and those who are served by
the evaluations. These substantive and methodological differences
probably explain why different names have been given to these
different forms of evaluation. For example, consider the following
statement by Quade: "Evaluations undertaken to enable
decision-makers to choose among systems, to discover whether a given system
would accomplish its objectives, or to set up a framework within
which tests of a system could be prepared came naturally to be called
'systems analysis.'"[9] While Quade acknowledged that systems analysis
is a form of evaluation, he also noted that the name systems analysis
was derived from the nature of this form of evaluation.

Historical review of the more highly developed forms of
evaluation listed above reveals that each was developed for relatively
specific applications. Program Evaluation and Review Technique
was developed to aid the military in making decisions in the
development of complex weapon systems. Systems analysis was developed to
aid the military in making decisions in the development and
implementation of military operations. Experimental design was especially
useful for making judgments about the relative merits of
agricultural products. And, initially, objective testing was utilized
largely as an aid to the military in selecting men for military
service. Clearly, the development of each of these forms of
evaluation was precipitated by critical decision-making needs; and these
forms of evaluation were thus based upon the types of decisions to be
served and the settings within which they were to be made. New
approaches to evaluation were developed because extant approaches did
not fit the decision-making requirements as precisely as needed, and
because the decisions to be made could have serious consequences if
wrong choices were made. Military decisions could affect the outcome
of wars; thus, operations research, systems analysis, etc., were developed.
Business decisions could result in profit, loss, or bankruptcy for
thousands of stockholders; thus, cost-benefit analysis and market
research were developed.

[9] Edward S. Quade, Editor, Analysis for Military Decisions.
Rand McNally and Company, Chicago, 1967, p. 4.

Evaluation in Education 

In the past, decisions about education have had effects less
tangible than those in business, agriculture and the military. Thus,
there have not been pressures in education equivalent to those in
other fields to motivate the development of highly specialized forms
of evaluation to serve well defined classes of educational decisions.
Indeed, most educators would be hard pressed to identify and define
the critical decision situations in education which merit specialized
means for evaluation. It cannot be said, however, that education has
been devoid of evaluation practices. Standardized testing has been
developed to a high art to aid in college entrance decisions, the
passing or failing of students, the assignment of diplomas and degrees,
and the placement of students in educational programs. The Buros
Mental Measurement Yearbooks[10] have been developed to aid educators in
the selection and use of tests. And, recently, Project EPIE
(Educational Products Information Exchange)[11] has been developed to assist
educators in selecting from among alternative products which are
related to education. Generally, however, educators have failed to
develop specialized means to aid their decisions about programs.

[10] Oscar K. Buros. The Buros Mental Measurement Yearbooks.
Highland Park, New Jersey: The Gryphon Press, 1959.

A prevalent position in education has been to avoid "reinventing
the wheel," but instead to look to other fields where problems similar
to those in education have been faced and solved. This reasoning has
led educators to adopt such evaluation modes as experimental design.
Here a technique, previously utilized to assist farmers to select from
among alternative kinds of fertilizer and seed, is being used to
assist educators to select from among alternative educational
innovations. The analogy between educational innovations and fertilizer
is hopefully remote. More recent forms of such borrowings are those
of Program Evaluation and Review Technique, systems analysis, and
the Program Planning and Budgeting System. At this point I would
like to note that selective borrowing from other fields can save
educators a great deal of time and effort. However, I also want to
caution that wholesale, non-selective borrowing of techniques from
other fields can result in the misapplication of techniques which
never were intended for and do not fit educational situations. I
think that educators' use of experimental design to evaluate
innovative programs is an example of what can happen in the latter case.
The use of experimental design in such applications has cost educators
much time and effort without yielding much assistance for
decision-making.

As stated earlier in the paper, I think educators need some
new basic conceptualizations to enable development of evaluation
theory and methodology which has specific relevance to educational
problems. In the previous section I have suggested a general rationale
and definition for evaluation. Now I will attempt to derive a
rationale and definition for evaluations in education.

A Rationale for Educational Evaluation

The Title I and Title III programs of the Elementary and
Secondary Education Act of 1965 provide a comprehensive, timely
context for deriving a rationale for educational evaluation. Virtually
every school district in the nation is involved with one or both of
these programs. The purposes of these programs respectively are to
increase the educational attainment, experiences, and opportunities
of disadvantaged children; and to increase the amount and quality of
innovation in local education agencies. Both programs are national
in scope, design, and broad control. They are coordinated and
specifically controlled at the state level and are implemented in local
school districts. Together, they provide more than a billion
dollars annually to local education agencies.

Figure 1 contains a conceptualization of the process and
decision functions of evaluation as they may exist in federal
assistance programs such as the Title I and Title III programs. A set
of feedback control loops illustrates the relationships among local,
state, and national evaluations of activities of federal assistance
programs. In Figure 1, the loop at the right shows local school
activities; the center loop, state activities; and the left loop,
federal activities. Each loop contains a set of blocks, varied in
shape, which represent the major evaluation functions.

[Figure 1. Feedback Control Loop: Evaluation in Federally Supported ...]

Block 1 portrays the local school district's program. This is
the local context from which needs for educational change emerge and
within which the changes to meet these needs must ultimately occur.
It includes the inputs of the system, e.g., the learners, curriculum,
staff, organization, policies, finances, physical facilities, and
school-community relations, and the outputs of the system, i.e., the
cognitive, psychological, physical, and social functioning of its
students and alumni.

To the right of Block 1, information collection is depicted
by the first segment of curved line. This is a systematic collection 
at the local level of all information needed for later decisions 
at local, state, and federal levels. 

Block 2 depicts the organization of information. Here,
information would be coded according to predetermined categories,
processed, e.g., keypunched, filed regularly, and retrieved as needed.

At Block 3, information organized at Block 2 would be
analyzed according to decision-making requirements at local, state
and national levels and reported to local and state decision-makers.

Block 4 denotes program decisions made at the local level. Local
school decision-makers to be served by the evaluation include the
Board of Education, the school administration, project supervisors,
teachers and principals.

The decisions made at Block 4 would be implemented at Block 5,
thus reactivating the cycle with frequent modification of the school
program at Block 1. This cycle is continuous.

Returning to Block 3, evaluation reports for the state education
department would be prepared annually by all public school districts
in the state. At Block 6, the state education department would
organize these reports into types of projects and combine
information from similar projects. This information would then be analyzed
at Block 7 to determine the strengths and weaknesses of the
statewide program. The state program officials would use this
information to assess the statewide educational needs and problems and to
make decisions about program emphases and state control at Block 8.
Decisions made at Block 8 would be implemented at Block 9,
affecting the state program at Block 10, and reactivating the cycle at
Block 1.

At Block 7, annual product evaluation reports from fifty states
would be sent to the federal agency. This information would then be
organized at Block 11, so that major program thrusts could be examined
and analyzed on a nationwide basis at Block 12 and so that reports
could be prepared for the Associate Commissioner for
Elementary and Secondary Education, the Commissioner of Education,
the Secretary of Health, Education and Welfare, the President, and
the Congress. Decisions about program emphases and funding would be
made at the federal level at Block 13 and implementation of such
decisions at Block 14 would affect the federal program at Block 15,
the state program at Block 10, and the local school project at Block
1, thus reactivating the cycle.

Summarized, Figure 1 demonstrates: (1) information for
evaluation at federal, state, and local levels will be collected largely
at the local level; (2) this information will form the basis for
federal, state, and local decisions which will ultimately affect
local operations; and (3) evaluation plans must be developed,
communicated, and coordinated at federal, state, and local levels if
the information schools provide is to be adequate for assisting in
the decision process at each of these levels.

Obviously, to develop an appropriate evaluation system for programs
such as Title I and Title III, one must first have some knowledge of the
decision situations to be served. Optimally, such knowledge of decision
situations should answer several questions. First, one should identify
the locus of decision-making, in terms of the level(s) at which
authority and responsibility for decision-making are vested, e.g., local,
state and/or national, and within each of these levels. Second, it is
desirable to identify the focus of the decisions: are they
related to goals of research, development, training, diffusion, etc.?
Third, one needs knowledge of the substance of the decisions (are
they related to mathematics, language arts, etc., and what are the
alternatives in each decision situation?). Fourth, one needs to
know the function of the decisions: are they for the planning,
programing, implementing or recycling of activities? Fifth, one needs
knowledge of the objects of the decisions (e.g., persons, places,
events, or things). Sixth, one obviously needs advance knowledge of
the timing of decisions. And, finally, one needs knowledge of the
relative criticality of decisions.

Considering all of the decision-making variables I have listed
above, it is clear that one could identify many, many different
kinds of decision situations in education. Thus, it
would also be possible to identify many different kinds of
evaluation. However, it should prove more useful to develop a parsimonious
classification system for kinds of educational evaluation which is
intermediate between the general conceptual definition of evaluation
given above and the many specific applied kinds of evaluation which
could be derived from the use of all of the above named variables in
a detailed analysis and classification of education decision
situations. Then it should be possible to derive useful names for the
identified classes of educational evaluation.

To assist in developing a parsimonious classification system
for educational decision situations in programs such as Title I and
Title III, I have found it useful initially to focus exclusively on
the functions of decisions.[12] I would postulate that functions of
decision situations in education may be classified as planning,
programing, implementing and recycling. Planning decisions are those
which focus needed improvements by specifying the domain, major goals,
and specific objectives to be served. Programing decisions specify
procedure, personnel, facilities, budget, and time requirements for
implementing planned activities. Implementing decisions are those
in directing programed activities. And, recycling decisions include
terminating, continuing, evolving, or drastically modifying activities.

Four Strategies for Evaluating Educational Programs 
Given these four kinds of educational decisions to be served,
there are also four kinds of evaluation. These are portrayed in
Figure 2 as context, input, process, and product evaluation.
Context evaluation would be used when a project is first being planned.
Input evaluation would be used immediately after context for specific
programing of activities. Process evaluation would be used
continuously during the implementation of the project. Product
evaluation would most likely be used after a complete cycle of the
project. Each of these kinds of evaluation will be considered
individually.
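
The correspondence between the four kinds of decisions and the four kinds of evaluation can be written down as a simple lookup, offered here only as a sketch. The enumeration and function names are hypothetical; the pairing itself follows the paragraph above (planning with context, programing with input, implementing with process, recycling with product).

```python
from enum import Enum


class DecisionFunction(Enum):
    PLANNING = "planning"
    PROGRAMING = "programing"
    IMPLEMENTING = "implementing"
    RECYCLING = "recycling"


class EvaluationKind(Enum):
    CONTEXT = "context"
    INPUT = "input"
    PROCESS = "process"
    PRODUCT = "product"


# Each class of decision is served by one kind of evaluation.
SERVED_BY = {
    DecisionFunction.PLANNING: EvaluationKind.CONTEXT,
    DecisionFunction.PROGRAMING: EvaluationKind.INPUT,
    DecisionFunction.IMPLEMENTING: EvaluationKind.PROCESS,
    DecisionFunction.RECYCLING: EvaluationKind.PRODUCT,
}


def evaluation_for(function: DecisionFunction) -> EvaluationKind:
    """Return the kind of evaluation that serves a given decision function."""
    return SERVED_BY[function]


# The order in which the evaluations would be used over one project cycle.
sequence = [evaluation_for(f).value for f in DecisionFunction]
```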

Context Evaluation 

The major objective of context evaluation is to define the
environment where change is to occur, the environment's unmet needs,
and the problems underlying those needs. For example, the
environment may be defined as the inner city elementary schools of a large
metropolitan area. Study of such a setting might reveal that the
actual reading achievement levels of children in this area are far
below what the school system expects for them. This would be the
identification of a need, i.e., the context evaluation would have
revealed that the children's reading achievement levels need to be
raised. As a next step in the context evaluation the school would
attempt to identify the reasons for such a need. Are the students
receiving adequate instruction? Are the instructional materials
appropriate for them? Is there a major language barrier? Is there
a high incidence of absenteeism? Is the school's expectation for
these students reasonable? Etc. These are what I mean by potential
problems. They are potential dilemmas which prevent the achievement
of desired goals and thereby result in the existence of needs.

[12] Daniel L. Stufflebeam. "The Use and Abuse of Evaluation in
Title III," Theory Into Practice. College of Education, The Ohio State
University, Volume VI, Number 3, June 1967.

[Figure 2. A Classification Scheme of Strategies for Evaluating Educational Change]

The method of context evaluation begins with a conceptual
analysis to identify and define the limits of the domain to be
served as well as its major subparts. Next, empirical analyses are
performed, using techniques such as sample survey, demography, and
standardized testing. The purpose of this part of context evaluation
is to identify the discrepancies among intended and actual situations
for each of the subparts of the domain of interest and thereby to
identify needs. Finally, context evaluation involves both
empirical and conceptual analyses, as well as appeal to theory and
authoritative opinion, to aid judgments regarding the basic problems
underlying each need.
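
The empirical step of context evaluation, comparing intended and actual situations for each subpart of the domain, might be sketched as follows. This is a minimal illustration only: the function name, the subpart labels, and the grade-equivalent reading figures are all hypothetical, and a real context evaluation would also draw on conceptual analysis, theory, and authoritative opinion, as noted above.

```python
from typing import Dict


def identify_needs(intended: Dict[str, float],
                   actual: Dict[str, float]) -> Dict[str, float]:
    """Return, for each subpart, the shortfall of the actual level
    below the intended level. Subparts with no shortfall are omitted."""
    return {
        part: intended[part] - actual[part]
        for part in intended
        if actual[part] < intended[part]
    }


# Hypothetical reading levels (grade equivalents) for three subparts.
intended = {"grade_3_reading": 3.0, "grade_4_reading": 4.0, "grade_5_reading": 5.0}
actual = {"grade_3_reading": 2.5, "grade_4_reading": 4.0, "grade_5_reading": 3.5}

needs = identify_needs(intended, actual)
```

The discrepancies locate the needs; finding the problems that underlie each need still calls for judgment.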

Decisions served by context evaluation include deciding upon the 
setting to be served, the goals associated with meeting needs, and the 
objectives associated with solving problems. Such decisions usually 
appear in the introductory sections of proposals to funding agencies
or in requests for proposals by funding agencies. 

Input Evaluation

To determine how to utilize resources to meet program goals and
objectives, it is necessary to do an input evaluation. Its objective
is to identify and assess relevant capabilities of the proposing
agency, strategies which may be appropriate for meeting program goals,
and designs which may be appropriate for achieving objectives associated
with each program goal. The end product of input evaluation is an
analysis of alternative procedural designs in terms of potential costs
and benefits. Specifically, alternative designs are assessed in
terms of their resource, time and budget requirements; their
potential procedural barriers; the consequences of not overcoming these
barriers; the possibilities and costs of overcoming them; relevance
of the designs to program objectives; and overall potential of the
design to meet program goals. Essentially, input evaluation provides
information for deciding whether outside assistance should be
sought for meeting goals and objectives, what strategy should be
employed, e.g., the adoption of available solutions or the
development of new ones, and what design or procedural plan should be employed
for implementing the selected strategy.
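
One way to picture the comparison of alternative designs in terms of potential costs and benefits is the sketch below. It is offered under stated assumptions: the Design class, the benefit-to-cost ratio used for ranking, the budget ceiling, and all dollar figures are hypothetical and are not prescribed by the text, which leaves the method of analysis open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Design:
    """A hypothetical alternative procedural design."""
    name: str
    estimated_cost: float      # resource and budget requirements
    estimated_benefit: float   # potential to meet program goals, in dollars
    barrier_cost: float = 0.0  # cost of overcoming procedural barriers

    def total_cost(self) -> float:
        return self.estimated_cost + self.barrier_cost

    def benefit_cost_ratio(self) -> float:
        return self.estimated_benefit / self.total_cost()


def rank_designs(designs: List[Design], budget: float) -> List[Design]:
    """Drop designs that exceed the budget, then rank the rest
    from highest to lowest benefit-to-cost ratio."""
    feasible = [d for d in designs if d.total_cost() <= budget]
    return sorted(feasible, key=lambda d: d.benefit_cost_ratio(), reverse=True)


designs = [
    Design("adopt_existing_program", 40_000, 60_000, barrier_cost=5_000),
    Design("develop_new_program", 90_000, 150_000, barrier_cost=20_000),
    Design("hire_outside_agency", 120_000, 200_000),
]
ranked = rank_designs(designs, budget=115_000)
```

A ratio is only one possible summary; relevance to objectives and the consequences of unresolved barriers would ordinarily be weighed as well.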

Methods for input evaluation are lacking in education. The
prevalent practices include committee deliberations, appeal to the
professional literature, and the employment of consultants. In a few
areas, formal instruments exist to aid decision-makers in making input
decisions. In the design of testing programs, one may obtain
substantial help by referring to the Buros Mental Measurements Yearbooks.[13]
The educational researcher, who wants to select an experimental design,
can receive material assistance in identifying and assessing
alternative experimental designs by referring to the Campbell-Stanley chapter
on experimental design in Gage's Handbook of Research on Teaching.[14]
In this chapter, the decision situation posed to the researcher in need
of an experimental design is neatly laid out in the form of alternative
designs which are relevant to experimental research. Each of these
designs is rated regarding its potential to meet criteria of internal
and external validity. Further, procedural barriers or sources of
invalidity are identified for each of the listed designs.


Decisions based upon input evaluation usually result in the
specification of procedures, materials, facilities, schedule, staff
requirements, and budgets in proposals to funding agencies. From
the information provided in the proposals, the funding agencies in
turn do an input evaluation to determine whether or not to fund
the proposed projects. Funding agencies commonly employ expert
consultants to serve as judges in their input evaluations.

[13] Buros, op. cit.

[14] N. L. Gage, Editor, Handbook of Research on Teaching. The
American Educational Research Association, Chicago: Rand McNally and
Company, 1963.

Process Evaluation 

Once a designed course of action has been approved and
implementation of the design has begun, process evaluation is needed to
provide periodic feedback to project managers and others responsible
for continuous control and refinement of plans and procedures. The
objective of process evaluation is to detect or predict, during
the implementation stages, defects in the procedural design or its
implementation. The overall strategy is to identify and monitor, on
a continuous basis, the potential sources of failure in a project.
These include interpersonal relationships among staff and students;
communication channels; logistics; understandings of and agreement
with the intent of the program by persons involved in and affected
by it; adequacy of the resources, physical facilities, staff, and
time schedule; etc.

As opposed to experimental design evaluation, process evaluation
does not require control over assignment of subjects to treatments,
nor that the treatments be held constant. Its purpose is to assist
project personnel to make their decisions a bit more rational in
their continual efforts to improve the quality of the program.
Thus, under process evaluation, the evaluator accepts the program
as it is and as it evolves, and monitors the total situation as
best he can by focusing the most sensitive and non-intervening data
collection devices and techniques that he can obtain on the most
crucial aspects of the project. Such evaluation is multivariate,
and not all of the important variables can be specified before a
project is initiated. The process evaluator focuses his attention
on theoretically important variates, but he also remains alert to
any unanticipated but significant events. Under process evaluation,
information is collected daily, organized systematically, analyzed
periodically, e.g., weekly, and reported as often as project personnel
require such information, e.g., monthly.

Thus, project decision-makers are not only provided with
information needed for anticipating and overcoming procedural difficulties,
but also with a record of process information to be used later for
interpreting project outcomes.

Product Evaluation 

Product evaluation is used to determine the effectiveness of the
project after it has run full cycle. Its objective is to relate
outcomes to objectives and to context, input, and process, i.e., to
measure and interpret outcomes.

The method is to operationally define and measure criteria
associated with the objectives of the activity, to compare these
measurements with predetermined absolute or relative standards, and
to make rational interpretations of the outcomes using the recorded
context, input, and process information. Criteria for product evaluation
may be either instrumental or consequential, a distinction pointed out
earlier by Scriven.[15] Instrumental criteria are related to program
outcomes which contribute to the achievement of behavioral objectives.
Clark and Guba have developed a taxonomy of instrumental objectives and
associated criteria which are related to educational change.[16] An
adaptation of their scheme is presented as Figure 3. Consequential criteria
are primarily those pertaining to behavioral objectives. Bloom's
Taxonomy of Educational Objectives[17] is useful in the identification
of consequential objectives.
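
The comparison of measurements with predetermined absolute or relative standards can be illustrated with a short sketch. The function names, the use of a group mean as the measurement, and the scores themselves are assumptions made for illustration; the text does not specify how a standard is to be applied.

```python
from statistics import mean
from typing import Sequence


def meets_absolute_standard(scores: Sequence[float], standard: float) -> bool:
    """Absolute standard: the group mean must reach a fixed level."""
    return mean(scores) >= standard


def meets_relative_standard(scores: Sequence[float],
                            comparison_scores: Sequence[float]) -> bool:
    """Relative standard: the group mean must exceed that of a comparison group."""
    return mean(scores) > mean(comparison_scores)


# Hypothetical outcome measurements for a project and a comparison group.
project_scores = [62, 70, 75, 81]
comparison_scores = [58, 64, 66, 72]

absolute_ok = meets_absolute_standard(project_scores, standard=70)
relative_ok = meets_relative_standard(project_scores, comparison_scores)
```

Either result is only a measurement against a standard; interpreting it still requires the recorded context, input, and process information.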

In the change process, product evaluation provides information
for deciding to continue, terminate, modify or refocus a change
activity, and for linking the activity to other phases of the change
process. For example, a product evaluation of a program to develop
after school study for students from disadvantaged homes might show
that the development objectives have been satisfactorily achieved
and that the developed innovation is ready to be diffused to other
schools which need such an innovation.

[15] Michael Scriven. The Methodology of Evaluation, Bloomington,
Indiana: Indiana University, Social Science Education Consortium,
Publication #110, 1965.

[16] David L. Clark and Egon G. Guba. "An Examination of Potential
Change Roles in Education," Paper read at the Symposium on Innovation in
Planning School Curricula, Airlie House, Virginia, October, 1965.

[17] Benjamin S. Bloom, Taxonomy of Educational Objectives: The
Classification of Educational Goals. Handbook I: Cognitive Domain, New
York: Longmans, Green and Company, Inc., 1956.

Given these four kinds of evaluation it is next necessary to
consider methodology for implementing them. This problem is considered
in the next section of this paper.

The Structure of Evaluation Design 

Once an evaluator has selected an evaluation strategy, e.g.,
context, input, process, or product, he must next select or develop
a design to implement his evaluation. This is a difficult task since
few generalized evaluation designs exist which are adequate to meet
emergent needs for evaluation. Thus, educators must typically develop
evaluation designs de novo. The remainder of this paper is an
attempt to provide a general guide for developing evaluation designs.
Specifically, I will attempt to define design in general terms and
to explicate the general structure of designs for educational
evaluation. Hopefully, this general treatment of evaluation design will
be of some help to educators in ordering their minds as they
approach problems of designing evaluations. Also, I am hopeful that
the following material might stimulate methodologists who are more
capable than I to develop generalized designs for context, input,
process, and product evaluation.

Design Defined

In general, design is the preparation of a set of decision
situations for implementation toward the achievement of specified
objectives. This definition says three things. First, one must
identify the objectives to be achieved through implementation of the
design. In a product evaluation, for example, such an objective
might be to determine whether all students in a remedial
reading program attained specified levels of specific reading skills.
Second, this definition says that one should identify and define the
decision situations in the procedure for achieving the evaluation
objective. For example, in the remedial reading case cited above
one would want to identify the available measuring devices which
might be appropriate for assessing the specified reading skills.
Third, for each identified decision situation the evaluator needs to
make a choice among the available alternatives. Thus, the completed
evaluation design would contain a set of decisions as to how the
evaluation is to be conducted and what instruments will be used.
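
The three parts of this definition can be sketched as a small data
structure. The following is only an illustration in Python, not a
prescribed method; the class names (DesignDecision, EvaluationDesign)
and the two measuring devices listed are hypothetical and do not come
from this paper.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DesignDecision:
    """One decision situation in the procedure for reaching an objective."""
    question: str
    alternatives: list[str]
    choice: Optional[str] = None  # filled in when the evaluator decides

    def decide(self, alternative: str) -> None:
        # A choice must be one of the identified alternatives.
        if alternative not in self.alternatives:
            raise ValueError(f"{alternative!r} is not an available alternative")
        self.choice = alternative


@dataclass
class EvaluationDesign:
    """Objectives plus the decision situations needed to achieve them."""
    objectives: list[str]
    decisions: list[DesignDecision] = field(default_factory=list)

    def is_complete(self) -> bool:
        # The design is complete once every decision situation has a choice.
        return all(d.choice is not None for d in self.decisions)


design = EvaluationDesign(
    objectives=["Determine whether all students in the remedial reading "
                "program attained the specified reading skill levels"],
    decisions=[DesignDecision(
        question="Which measuring device will assess the specified skills?",
        alternatives=["standardized reading test", "teacher-made test"],
    )],
)
before = design.is_complete()
design.decisions[0].decide("standardized reading test")
after = design.is_complete()
```

Here the design is incomplete until a choice has been recorded for the
one decision situation, which mirrors the third point of the definition.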

It should be useful to evaluators to have available a list of
the decision situations which are common to many evaluation designs.
This would enable them to approach problems of evaluation design in
a systematic manner. Further, such a list could serve as an outline
for the content of evaluation sections in research and development
proposals. Funding agencies should also find such a list useful
in structuring their general guidelines for evaluations which they
provide to potential proposal writers. Also, such a list should be
useful to training agencies for defining the role of the evaluation
specialist.

Figure 4 is an attempt to provide such a general list of
decision situations for evaluation designs. By presenting this
general list I am asserting that the structure of evaluation design
is the same for context, input, process, or product evaluation.
This structure includes six major parts. These are 1) focusing the
evaluation, 2) information collection, 3) information organization,
4) information analysis, 5) information reporting, and 6) the
administration of evaluation. Each of these parts will be considered
separately.

Focusing the Evaluation

The first part of the structure of evaluation design is that
of focusing the evaluation. The purpose of this part is to spell
out the ends for the evaluation and to define policies within which
the evaluation must be conducted. Specifically, this part of
evaluation design includes four steps.

The first step is to identify the major levels of decision-making
for which evaluation information must be provided. For example,
in the Title III program of the Elementary and Secondary
Education Act evaluative information from local schools is needed
at local, state and national levels. It is important to take all
relevant levels into account in the design of evaluations since
different levels may have different information requirements and
since the different agencies may need information at different times.

Figure 4

DEVELOPING EVALUATION DESIGNS

The logical structure of evaluation design is the same for all types of
evaluation, whether context, input, process or product evaluation. The
parts, briefly, are as follows:

A. Focusing the Evaluation

   1. Identify the major level(s) of decision-making to be served, e.g.,
      local, state, or national.

   2. For each level of decision-making, project the decision situations
      to be served and describe each one in terms of its locus, focus,
      criticality, timing, and composition of alternatives.

   3. Define criteria for each decision situation by specifying variables
      for measurement and standards for use in the judgment of
      alternatives.

   4. Define policies within which the evaluation must operate.

B. Collection of Information

   1. Specify the source of the information to be collected.

   2. Specify the instruments and methods for collecting the needed
      information.

   3. Specify the sampling procedure to be employed.

   4. Specify the conditions and schedule for information collection.

C. Organization of Information

   1. Provide a format for the information which is to be collected.

   2. Designate a means for coding, organizing, storing, and retrieving
      information.

D. Analysis of Information

   1. Select the analytical procedures to be employed.

   2. Designate a means for performing the analysis.

E. Reporting of Information

   1. Define the audiences for the evaluation reports.

   2. Specify means for providing information to the audiences.

   3. Specify the format for evaluation reports and/or reporting
      sessions.

   4. Schedule the reporting of information.

F. Administration of the Evaluation

   1. Summarize the evaluation schedule.

   2. Define staff and resource requirements and plans for meeting these
      requirements.

   3. Specify means for meeting policy requirements for conduct of the
      evaluation.

   4. Evaluate the potential of the evaluation design for providing
      information which is valid, reliable, credible, timely, and
      pervasive.

   5. Specify and schedule means for periodic updating of the evaluation
      design.

   6. Provide a budget for the total evaluation program.

Having identified the major levels of decision-making to be
served by evaluation, the next step is to identify and define the
decision situations to be served at each level. Given our present
low state of knowledge about decision-making in education, this is a
very difficult task. However, it is also a very important one and
should be done as well as is practicable. First, decision situations
should be identified in terms of those responsible for making the
decisions, e.g., teachers, principals, board of education members,
state legislators, etc. Next, major types of decision situations
should be identified, e.g., appropriational, allocational, approval,
or continuation. Then these types of decision situations should be
classified by focus, e.g., research, development, diffusion or adoption
in the case of instrumental outcomes, or knowledge or understanding in
the case of consequential outcomes. (This step is especially helpful
toward identifying relevant evaluative criteria.) These identified
decision situations should then be analyzed in terms of their relative
criticality. In this way relatively less important decisions which would
expend evaluation resources needlessly can be eliminated from further
consideration. Next, the timing of the decision situation to be served
should be estimated so that the evaluation can be geared to provide
relevant data prior to the time when decisions must be made. And, finally,
an attempt should be made to explicate each important decision
situation in terms of the alternatives which may reasonably be considered
in reaching the decision.
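
The description of a decision situation by locus, type, focus,
criticality, timing, and alternatives lends itself to a simple record.
The sketch below, in Python, is one possible way to hold such records,
drop the less critical ones, and order the rest by timing. The 1-to-5
criticality scale, the cutoff of 3, and all dates and example entries
are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DecisionSituation:
    locus: str            # who is responsible for making the decision
    kind: str             # e.g. appropriational, allocational, continuation
    focus: str            # e.g. research, development, diffusion, adoption
    criticality: int      # hypothetical scale: 1 (minor) to 5 (vital)
    decision_date: date   # estimated time by which the decision is made
    alternatives: list[str]


def situations_to_serve(situations, min_criticality=3):
    """Eliminate less important decisions, then order the rest by timing."""
    kept = [s for s in situations if s.criticality >= min_criticality]
    return sorted(kept, key=lambda s: s.decision_date)


situations = [
    DecisionSituation("teacher", "continuation", "adoption", 2,
                      date(1968, 5, 1), ["keep materials", "drop materials"]),
    DecisionSituation("board of education", "allocational", "development", 5,
                      date(1968, 6, 1), ["fund program", "do not fund"]),
    DecisionSituation("state legislators", "appropriational", "diffusion", 4,
                      date(1968, 4, 1), ["expand statewide", "hold at pilot"]),
]
served = situations_to_serve(situations)
served_loci = [s.locus for s in served]
```

Ordering by decision date makes it easier to see when evaluative
information must be ready for each decision that remains.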

Once the decision situations to be served have been explicated,
the next step is to define relevant information requirements.
Specifically, one should define criteria for each decision situation by
specifying variables for measurement and standards for use in the
judgment of alternatives.

The final step in focusing the evaluation is to define policies
within which the evaluation must operate. For example, one should
determine whether a "self evaluation" or "outside evaluation" is
needed. Also, it is necessary to determine who will receive
evaluation reports and who will have access to them. Finally, it is
necessary to define the limits of access to data for the evaluation
team.
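
As a rough illustration of the third focusing step, a criterion can be
represented as a variable paired with a standard, and each alternative
judged against the full set of criteria. The sketch below is in Python;
the variable names, standards, alternatives, and measurements are all
invented for illustration and do not come from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    variable: str     # what is to be measured
    standard: float   # hypothetical minimum acceptable value


def acceptable(measurements, criteria):
    """An alternative passes only if every variable meets its standard."""
    for criterion in criteria:
        value = measurements.get(criterion.variable)
        # A missing measurement is treated as failing the standard.
        if value is None or value < criterion.standard:
            return False
    return True


criteria = [
    Criterion("reading comprehension score", 60.0),
    Criterion("attendance rate", 0.85),
]
measured = {
    "tutoring": {"reading comprehension score": 72.0, "attendance rate": 0.90},
    "study hall": {"reading comprehension score": 55.0, "attendance rate": 0.95},
}
judged = {name: acceptable(m, criteria) for name, m in measured.items()}
```

Treating a missing measurement as a failure is a design choice of this
sketch; an evaluator might instead flag it for further data collection.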

Collection of Information

The second major part of the structure of evaluation design 
is that of planning the collection of information. This section 
must obviously be keyed very closely to the criteria which were 
identified in the Evaluation Focus part of the design. 

Using those criteria one should first identify the sources of
the information to be collected. These information sources should
be defined in two respects: first, the origins for the information,
e.g., students, teachers, principals or parents, and second, the
present state of the information, i.e., in recorded or non-recorded
form.



Next, one should specify instruments and methods for collecting
the needed information. Examples include achievement tests, interview
schedules and searches through the professional literature.
Michael and Metfessel have recently provided a comprehensive list
of instruments with potential relevance for data collection in
evaluations.^18

For each instrument that is to be administered, one should next 
specify the sampling procedure to be employed. Where possible, one 
should avoid administering too many instruments to the same person.
Thus, sampling without replacement across instruments can be a useful 
technique. Also, where total test scores are not needed for each 
student, one might profitably use multiple matrix sampling where 
no student attempts more than a sample of the items in a test. 
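
The two sampling ideas just mentioned can be sketched in a few lines of
Python. This is only one simple variant of each technique, under stated
assumptions: the function names are hypothetical, samples are of equal
size, and items are split into equal-sized subtests with each student
receiving exactly one subtest.

```python
import random


def assign_instruments(students, instruments, per_instrument, seed=0):
    """Draw a separate sample for each instrument without reusing anyone."""
    if per_instrument * len(instruments) > len(students):
        raise ValueError("not enough students for disjoint samples")
    rng = random.Random(seed)
    pool = list(students)
    rng.shuffle(pool)
    # Consecutive slices of the shuffled pool never overlap.
    return {
        instrument: pool[i * per_instrument:(i + 1) * per_instrument]
        for i, instrument in enumerate(instruments)
    }


def matrix_sample(students, items, n_forms, seed=0):
    """Split the items into n_forms subtests and give each student one."""
    rng = random.Random(seed)
    shuffled_items = list(items)
    rng.shuffle(shuffled_items)
    forms = [shuffled_items[k::n_forms] for k in range(n_forms)]
    shuffled_students = list(students)
    rng.shuffle(shuffled_students)
    return {
        student: forms[index % n_forms]
        for index, student in enumerate(shuffled_students)
    }


students = [f"student{i}" for i in range(12)]
groups = assign_instruments(
    students, ["achievement test", "interview schedule"], per_instrument=4)
plan = matrix_sample(students, range(1, 21), n_forms=4)
```

With 20 items and 4 forms, no student attempts more than 5 items, yet
every item is still attempted by some students.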

Finally, one should develop a master schedule for the collection
of information. This schedule should detail the interrelations
between samples, instruments, and dates for the collection of
information.

^18 Newton S. Metfessel and William B. Michael. "A Paradigm
Involving Multiple Criterion Measures for the Evaluation of the
Effectiveness of School Programs." Educational and Psychological
Measurement, 1967, 27, 931-936.

Organization of Information

A frequent disclaimer in evaluation reports is that resources
were inadequate to allow for processing all of the pertinent data.
If this problem is not to arise, one should make definite plans
regarding the third part of evaluation design: Organization of
Information. Organizing the information that is to be collected
includes providing a format for classifying information and
designating means for coding, organizing, storing, and retrieving the
information.
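
A minimal sketch of what such a plan might look like is given below in
Python. It assumes a fixed record format, a small codebook for coding
raw responses, and a comma-separated file for storage and retrieval;
the field names, codebook entries, and function names are hypothetical
and chosen only for illustration.

```python
import csv
import io

# Hypothetical record format and codebook, agreed on before collection.
FIELDS = ["student_id", "instrument", "item", "code"]
CODEBOOK = {"strongly disagree": 1, "disagree": 2,
            "agree": 3, "strongly agree": 4}


def code_response(raw):
    """Convert a raw response into its numeric code."""
    return CODEBOOK[raw.strip().lower()]


def store(records, handle):
    """Write coded records to an open text file in the agreed format."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


def retrieve(handle, instrument):
    """Read back all stored records for one instrument."""
    return [row for row in csv.DictReader(handle)
            if row["instrument"] == instrument]


records = [
    {"student_id": "s01", "instrument": "attitude survey", "item": 1,
     "code": code_response(" Agree ")},
    {"student_id": "s01", "instrument": "reading test", "item": 1,
     "code": 1},
]
storage = io.StringIO()   # stands in for a file on disk
store(records, storage)
storage.seek(0)
survey_rows = retrieve(storage, "attitude survey")
```

Values come back from the file as text, so any analysis step would need
to convert the codes to numbers again.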

Analysis of Information

The fourth major part of evaluation design is analysis of
information. The purpose of this part is to provide for the
descriptive or statistical analyses of the information which is to
be reported to decision-makers. This part also includes
interpretations and recommendations. As with the organization of
information it is important that the evaluation design specify means
for performing the analyses. The role should be assigned specifically
to a qualified member of the evaluation team or to an agency
which specializes in doing data analyses. Also, it is important
that those who will be responsible for the analysis of information
participate in designing the analysis procedures.
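
By way of illustration, a simple descriptive analysis for the remedial
reading example used earlier might be sketched as follows in Python.
The scores and the standard of 55 are invented, and the function name
is hypothetical; only the standard library statistics module is used.

```python
from statistics import mean, stdev


def describe(scores, standard):
    """Summarize scores and the share of students meeting the standard."""
    attained = sum(1 for score in scores if score >= standard)
    return {
        "n": len(scores),
        "mean": mean(scores),
        "sd": stdev(scores),   # sample standard deviation
        "proportion_attaining": attained / len(scores),
    }


reading_scores = [52, 61, 48, 70, 65, 58]   # hypothetical data
summary = describe(reading_scores, standard=55)
```

A summary of this kind supports description only; interpretation and
recommendations remain a matter of judgment for the evaluation team.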

Reporting of Information

The fifth part of evaluation design is the reporting of
information. The purpose of this part of a design is to insure that
decision-makers will have timely access to the information they
need and that they will receive it in a manner and form which
facilitates their use of the information. In accordance with the policy
for the evaluation, audiences for evaluation reports should be
identified and defined. Then means should be defined for providing
information to each audience. Subsequently, the format for evaluation
reports and reporting sessions should be specified. And, finally,
a master schedule of evaluation reporting should be provided. This
schedule should define the interrelations between audiences, reports,
and dates for reporting information.
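
One way to picture such a master schedule is as a list of entries that
tie each audience to a report and its dates, which can then be checked
for timeliness. The Python sketch below assumes that each entry also
records the date of the decision the report is meant to serve; the
audiences, reports, and dates shown are hypothetical.

```python
from datetime import date

# Each entry links an audience, a report, and its dates (all invented).
REPORTING_SCHEDULE = [
    {"audience": "local school board", "report": "interim process report",
     "report_date": date(1968, 3, 1), "decision_date": date(1968, 3, 15)},
    {"audience": "state education agency", "report": "final product report",
     "report_date": date(1968, 7, 1), "decision_date": date(1968, 6, 15)},
]


def untimely_reports(schedule):
    """Return entries whose report would arrive after the decision."""
    return [entry for entry in schedule
            if entry["report_date"] > entry["decision_date"]]


late = untimely_reports(REPORTING_SCHEDULE)
late_audiences = [entry["audience"] for entry in late]
```

Any entry flagged here would call for rescheduling the report before
the evaluation begins.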

Administration of Evaluation

The last part of evaluation design is that of administration
of the evaluation. The purpose of this part is to provide an overall
plan for executing the evaluation design. The first step is to
define the overall evaluation schedule. For this purpose it often
would be useful to employ a scheduling technique, such as Program
Evaluation and Review Technique. The second step is to define staff
requirements and plans for meeting these requirements. The third
step is to specify means for meeting policy requirements for conduct
of the evaluation. The fourth step is to evaluate the potential of
the evaluation design for providing information which is valid,
reliable, credible, timely, and pervasive. The fifth step is to
specify and schedule means for periodic updating of the evaluation
design. And, the sixth and final step is to provide a budget for
the evaluation.
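
To suggest how a scheduling technique of this kind might be applied,
the Python sketch below uses the standard PERT three-point estimate,
expected time = (optimistic + 4 x most likely + pessimistic) / 6, and
then finds the earliest finish time of each activity given its
predecessors. The activities, their estimates (in weeks), and their
ordering are assumptions made for illustration and are not taken from
this paper.

```python
def expected_time(optimistic, likely, pessimistic):
    """PERT three-point estimate of an activity's duration."""
    return (optimistic + 4 * likely + pessimistic) / 6


# name -> ((optimistic, likely, pessimistic) in weeks, predecessors).
# Hypothetical activities, listed so that predecessors come first.
ACTIVITIES = {
    "focus evaluation": ((1, 2, 3), []),
    "develop instruments": ((2, 4, 6), ["focus evaluation"]),
    "draw samples": ((1, 1, 1), ["focus evaluation"]),
    "collect information": ((3, 4, 5),
                            ["develop instruments", "draw samples"]),
    "analyze and report": ((2, 3, 4), ["collect information"]),
}


def earliest_finish(activities):
    """Earliest finish of each activity; it starts when all predecessors end."""
    finish = {}
    for name, (estimates, predecessors) in activities.items():
        start = max((finish[p] for p in predecessors), default=0.0)
        finish[name] = start + expected_time(*estimates)
    return finish


finish = earliest_finish(ACTIVITIES)
total_weeks = finish["analyze and report"]
```

In this example the longest chain runs through instrument development,
so any delay there would delay the whole evaluation.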

Finally, I have reached the end of my paper. While I have only
scratched the surface regarding educational evaluations, it is clear
to me that the design and analysis of educational evaluation is a
most complex and difficult undertaking. Surely, all of us who are
committed to reshaping the world of educational evaluation must work
very, very hard if we are to make any progress. If progress is not
made in this area, I am convinced that education will be a casualty
for want of adequate information to support vital decisions in and
about education.